Tenth Meeting of the ACL Special Interest Group on Computational Morphology and Phonology

Authors

Jason Eisner

Jeffrey Heinz

Abstract

The performance of automatic speech recognition systems varies widely across different contexts. Very good performance can be achieved on single-speaker, large-vocabulary dictation in a clean acoustic environment, as well as on very small vocabulary tasks with far fewer constraints on the speakers and acoustic conditions. In other domains, speech recognition is still far from usable for real-world applications. One domain that is still elusive is that of spontaneous conversational speech. This type of speech poses a number of challenges, such as the presence of disfluencies, a mix of speech and non-speech sounds such as laughter, and extreme variation in pronunciation. In this talk, I will focus on the challenge of pronunciation variation. A number of analyses suggest that this variability is responsible for a large part of the drop in recognition performance between read (dictated) speech and conversational speech. I will describe efforts in the speech recognition community to characterize and model pronunciation variation, both for conversational speech and in general. The work can be roughly divided into several types of approaches, including: augmentation of a phonetic pronunciation lexicon with phonological rules; the use of large (syllable- or word-sized) units instead of the more traditional phonetic ones; and the use of smaller units, such as distinctive or articulatory features. Of these, the first is the most thoroughly studied and also the most disappointing: despite successes in a few domains, it has been difficult to obtain significant recognition improvements by including in the lexicon those phonetic pronunciations that appear to exist in the data. In part as a reaction to this, many have advocated the use of a “null pronunciation model,” i.e., a very limited lexicon including only canonical pronunciations.
The assumption in this approach is that the observation model—the distribution of the acoustics given phonetic units—will better model the “noise” introduced by pronunciation variability. I will advocate an alternative view: that the phone unit may not be the most appropriate for modeling the lexicon. When considering a variety of pronunciation phenomena, it becomes apparent that phonetic transcription often obscures some of the fundamental processes that are at play. I will describe approaches using both larger and “smaller” units. Larger units are typically syllables or words, and allow greater freedom to model the component states of each unit. In the class of “smaller” unit models, ideas from articulatory and autosegmental phonology motivate multi-tier models in which different features (or groups of features) have semi-independent behavior. I will present a particular model in which articulatory features are represented as variables in a dynamic Bayesian network. Non-phonetic pronunciation models can involve significantly different model structures than those typically used in speech recognition, and as a result they may also entail modifications to other components such as the observation model and training algorithms. At this point it is not clear what the “winning” approach will be. The success of a given approach may depend on the domain or on the amount and type of training data available. I will describe some of the current challenges and ongoing work, with a particular focus on the role of phonological theories in statistical models of pronunciation (and vice versa?).

Related resources

Computing and Historical Phonology Proceedings of the Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology

We introduce the proceedings from the workshop ‘Computing and Historical Phonology: 9th Meeting of the ACL Special Interest Group for Computational Morphology and Phonology’.

Welcome to the ACL Workshop on Computing and Historical Phonology, the 9th Meeting of ACL Special Interest Group for Computational Morphology and Phonology, a meeting held in conjunction with the 45th Meeting of the ACL

We introduce the proceedings from the workshop ‘Computing and Historical Phonology: 9th Meeting of the ACL Special Interest Group for Computational Morphology and Phonology’.

Computing and Historical Phonology

We introduce the proceedings of the workshop ‘Computing and Historical Phonology: 9th Meeting of ACL Special Interest Group for Computational Morphology and Phonology’.

Computational Phonology: Third Meeting of the ACL Special Interest Group in Computational Phonology, SIGPHON@EACL 1997, Madrid, Spain, July 12, 1997

Finite-State Phonology Proceedings of the Fifth Workshop of the ACL Special Interest Group in Computational Phonology

Finite-state morphology in the general tradition of the Two-Level and Xerox implementations has proved very successful in the production of robust morphological analyzer-generators, including many large-scale commercial systems. However, it has long been recognized that these implementations have serious limitations in handling non-concatenative phenomena. We describe a new technique for constr...

Publication date: 2008
